Development of a population‐level prediction model for intensive care unit (ICU) survivorship and mortality in older adults: A population‐based cohort study

Abstract

Background and Aims: Given the growing utilization of critical care services by an aging population, development of population‐level risk models which predict intensive care unit (ICU) survivorship and mortality may offer advantages for researchers and health systems. Our objective was to develop a risk model for ICU survivorship and mortality among community dwelling older adults.

Methods: This was a population‐based cohort study of 48,127 patients who were 50 years and older with at least one primary care visit between January 1, 2017, and December 31, 2017. We used electronic health record (EHR) data to identify variables predictive of ICU survivorship.

Results: ICU admission and mortality within 2 years after the index primary care visit date were used to divide patients into three groups: "alive without ICU admission," "ICU survivors," and "death." Multinomial logistic regression was used to identify EHR predictive variables for the three patient outcomes. Cross‐validation by randomly splitting the data into derivation and validation data sets (60:40 split) was used to identify predictor variables and validate model performance using the area under the receiver operating characteristic curve (AUC). In our overall sample, 92.2% of patients were alive without ICU admission, 6.2% were admitted to the ICU at least once and survived, and 1.6% died. Greater deciles of age over 50 years, diagnoses of chronic obstructive pulmonary disease or chronic heart failure, and laboratory abnormalities in alkaline phosphatase, hematocrit, and albumin contributed the highest risk score weights for mortality. Risk scores derived from the model discriminated between patients who died versus remained alive without ICU admission (AUC = 0.858), and between ICU survivors versus alive without ICU admission (AUC = 0.765).

Conclusion: Our risk scores provide a feasible and scalable tool for researchers and health systems to identify patient cohorts at increased risk for ICU admission and survivorship. Further studies are needed to prospectively validate the risk scores in other patient populations.

critical care outcomes, mortality, population health, risk

| INTRODUCTION
More than 5 million patients are admitted to intensive care units (ICUs) in the United States each year, and utilization of critical care services is forecast to increase given an aging population living longer and with more comorbidities. 6-9 Given the morbidity associated with critical illness and post-intensive care syndrome (PICS), novel approaches that can stratify the population by risk for critical illness are of growing scientific interest. Such population-based models for ICU admission, mortality, and survival past the index admission 7,10,11 could be used by researchers and health systems to design, study, and implement programs for patient populations at higher risk for critical illness and development of PICS.
Electronic health record (EHR) data offer the possibility of developing predictive, scalable, and generalizable models for prognostication. 12,23,24 Predictive risk scores that can discriminate among these outcomes hold potential to improve current care pathways in two ways: (a) early identification of older adults in a community or health system at highest risk for ICU admission, allowing recruitment and follow-up in cohort studies; and (b) development of novel health services programs and infrastructure by critical care stakeholders to advance the care of populations at risk for future critical illness and PICS. With these goals in mind, we conducted this study using EHR data from a large cohort of older adults seen in the primary care setting to develop and validate a multivariable and scalable risk prediction tool for ICU survivorship and mortality. Our goal was to develop a population-level prediction model to be utilized by researchers and hospital administrators.

Key Points
• Existing predictive models for critical illness often utilize clinical data from emergency room settings or assist in triage of patients admitted to the hospital. These predictive models do not differentiate outpatients who may develop critical illness in the future.
• Our study describes development of a predictive risk score to predict future intensive care unit (ICU) admission and ICU survival among community dwelling older adults seen for routine primary care.
• This risk prediction model performed well in identification of future ICU survivors, and may be a novel tool for researchers and health systems to recruit patients for clinical research.
Approval Type: Exempt), and all study activities were conducted in accordance with the Declaration of Helsinki. We report our methods and results in accordance with the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) reporting guidelines. 25

The study cohort comprised patients aged 50 years or older who had at least one visit with a primary care provider (PCP) at either of two urban, academic health care systems (Eskenazi Health and Indiana University Health) affiliated with Indiana University School of Medicine in Indianapolis, Indiana, between January 1, 2017, and December 31, 2017. This date range was chosen so that patients' subsequent 2-year outcomes in 2018−2019 could be used without the influence of the COVID-19 pandemic. Both health systems provide care to a socioeconomically, racially, and geographically diverse patient population in the Indianapolis region. We chose to develop our prediction model in older adults (age 50 and older) given their increased rates of critical care utilization and high rates of PICS.

| Data elements
We utilized the Indiana Network for Patient Care (INPC), a statewide EHR data warehouse, to identify patients meeting the eligibility criteria and perform the automated data extraction. 26 INPC is the largest health information exchange in the United States, with data comprising more than 13 million patients across the state of Indiana.
For each patient included in the cohort, we defined the index date as the patient's last PCP visit date in 2017; if a patient had more than one PCP visit during the study year, the last visit was used. We then retrieved structured EHR data including demographics, medical history (ICD-9 and ICD-10 codes), laboratory results, medication orders, and health care utilization (outpatient clinic visits, emergency room visits, hospitalizations, and ICU admissions) from the 2 years before the index date. We collected ICU admission and mortality data for the 2 years after the index PCP visit.
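The index-date rule and the two 2-year windows can be illustrated with a minimal Python sketch. This is not the study's extraction code (the analyses were run in SAS); the function names `shift_years` and `index_and_windows` and the example visit dates are hypothetical, and handling of February 29 is an assumption made only so the example runs.

```python
from __future__ import annotations
from datetime import date

def shift_years(d: date, years: int) -> date:
    """Shift a date by whole years; Feb 29 maps to Feb 28 in non-leap years (assumption)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)

def index_and_windows(pcp_visits: list[date], study_year: int = 2017):
    """Return (index_date, lookback_start, followup_end), or None if no visit in the study year."""
    in_year = [v for v in pcp_visits if v.year == study_year]
    if not in_year:
        return None  # patient does not meet the inclusion criterion
    index_date = max(in_year)  # last PCP visit in the study year
    return index_date, shift_years(index_date, -2), shift_years(index_date, 2)

# Hypothetical patient with two PCP visits in 2017
visits = [date(2016, 11, 2), date(2017, 3, 14), date(2017, 9, 30)]
index_date, lookback_start, followup_end = index_and_windows(visits)
```

Predictors would then be drawn from records dated between `lookback_start` and `index_date`, and outcomes from records dated between `index_date` and `followup_end`.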
We first converted all ICD-9 codes into ICD-10 codes using a SAS data set downloaded from the Centers for Medicare & Medicaid Services website. In logistic regression models with many predictors, binary predictors with low frequencies may lead to problems with the convergence of the likelihood function and unstable parameter estimates, so common practice is to exclude binary variables with small event frequencies. We therefore excluded medical conditions with less than 2% prevalence in the overall sample, resulting in 211 unique ICD-10 codes which were included in the development of our model (see Supporting Information: Table 1 for the list of these ICD codes). 27

We also included laboratory tests which are commonly obtained in the outpatient setting, such as complete blood count, comprehensive metabolic panel, lipids (total cholesterol, triglyceride, high-density lipoprotein, and low-density lipoprotein), and thyroid function (see Supporting Information: Table 2 for the list of laboratory tests included in the model development). Laboratory values were dichotomized as normal or abnormal using published laboratory standards to minimize the potential influence of extreme values (see Supporting Information: Table 2). Patients who did not have a laboratory test within the 2 years before the index visit were treated as having a normal value. We used the medication Generic Product Identifier code to classify medications into drug classes (see Supporting Information: Table 3 for the list of medications included in the model development) and excluded medications with less than 2% prevalence in the overall sample. 28
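The two preprocessing rules described above (dropping binary predictors below 2% prevalence, and dichotomizing laboratory values with missing tests treated as normal) can be sketched as follows. The function names, the toy cohort, and the albumin reference range are illustrative assumptions; the actual code lists and laboratory standards are those in the Supporting Information tables.

```python
from __future__ import annotations

def prevalent_codes(patient_codes: dict[str, set[str]], min_prevalence: float = 0.02) -> set[str]:
    """Keep codes present in at least `min_prevalence` of patients."""
    n = len(patient_codes)
    counts: dict[str, int] = {}
    for codes in patient_codes.values():
        for code in codes:
            counts[code] = counts.get(code, 0) + 1
    return {code for code, k in counts.items() if k / n >= min_prevalence}

def lab_abnormal(value: float | None, low: float, high: float) -> int:
    """Return 1 if the value is outside [low, high], else 0; a missing test counts as normal."""
    if value is None:
        return 0  # no test in the 2-year lookback is treated as normal
    return int(value < low or value > high)

# Toy cohort of 100 hypothetical patients
cohort = {f"p{i}": set() for i in range(100)}
for i in range(50):
    cohort[f"p{i}"].add("I10")      # 50% prevalence: kept
for i in range(3):
    cohort[f"p{i}"].add("J44.9")    # 3% prevalence: kept
cohort["p0"].add("Z99.9")           # 1% prevalence: dropped
kept = prevalent_codes(cohort)

# Illustrative (not study-specific) albumin reference range in g/dL
albumin_flags = [lab_abnormal(v, 3.5, 5.0) for v in (2.9, 4.1, None)]
```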

| Outcome measures
We categorized patients in the cohort into three mutually exclusive outcome groups.

| Alive without ICU admission
Patients who survived the 2-year follow-up period after index PCP visit and were never admitted to the ICU.

| ICU survivor
Patients who had at least one ICU admission within 2 years of index PCP visit, and survived for at least 30 days after the first ICU discharge.

| Death
Patients who died within 2 years of the index PCP visit, including patients who died without an ICU admission, those who died during the first ICU admission, and those who died within 30 days after discharge from the first ICU admission.
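The three outcome definitions can be expressed as a small classification function. The sketch below is one reading of the definitions, not the study's code: it assumes all dates have already been restricted to the 2-year follow-up window (events outside it set to `None`), that a patient who dies 30 or more days after the first ICU discharge is still counted as an ICU survivor, and that a missing discharge date with a recorded death means the patient died during the ICU stay. The function name and example dates are hypothetical.

```python
from __future__ import annotations
from datetime import date, timedelta

def classify_outcome(icu_admit: date | None,
                     icu_discharge: date | None,
                     death: date | None) -> str:
    """Assign one of the three mutually exclusive outcome groups."""
    if icu_admit is None:
        return "Death" if death is not None else "Alive without ICU admission"
    if death is None:
        return "ICU survivor"
    if icu_discharge is None:
        return "Death"  # assumed: died during the first ICU admission
    if death - icu_discharge < timedelta(days=30):
        return "Death"  # died during or within 30 days of the first ICU discharge
    return "ICU survivor"  # assumed: survived at least 30 days after discharge

examples = [
    classify_outcome(None, None, None),
    classify_outcome(None, None, date(2018, 5, 1)),
    classify_outcome(date(2018, 3, 1), date(2018, 3, 10), None),
    classify_outcome(date(2018, 3, 1), date(2018, 3, 10), date(2018, 3, 20)),
    classify_outcome(date(2018, 3, 1), date(2018, 3, 10), date(2018, 9, 1)),
    classify_outcome(date(2018, 3, 1), None, date(2018, 3, 5)),
]
```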

| Model development
We compared patients' characteristics and medical conditions among the three outcome categories using analysis of variance for continuous measures and χ² tests for categorical measures, with a 2-sided p value of <0.05 considered statistically significant. Multinomial logistic models with the three outcome categories described above were used to identify patients' characteristics, medical conditions, laboratory results, and medication classes that were associated with the three patient groups. We chose this approach for model development as our goal was to create risk scores that are meaningful and straightforward to calculate, differentiating our model from machine learning prediction tools which are often too complex to interpret. 29

Given the large number of potential variables contained in EHR data, we used forward model selection with Schwarz's Bayesian Criterion (SBC) as the model selection criterion. In each iteration of the forward selection process, we evaluated the impact of adding one predictor variable to the model and selected the predictor with the lowest SBC. The SBC penalizes models for their complexity to prevent overfitting. 30,31

To determine model validity, we performed cross-validation by randomly splitting the data into a derivation data set and a validation data set (60:40 split) ten times. Predictor variables were retained in our model if they were selected in at least 50% of models from the derivation data set. Parameter estimates from selected predictor variables were averaged across all 10 cross-validation models to create predictive risk scores for "Death versus Alive without ICU Admission" and "ICU Survivors versus Alive without ICU Admission." All analyses were performed using SAS v9.4.
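The selection and cross-validation steps can be sketched in plain Python under stated assumptions. The SBC formula, -2 log L + k ln(n), is the standard definition. Everything else here is illustrative rather than the study's SAS procedure: the stopping rule (stop when no candidate lowers the SBC), the `criterion` callable standing in for a fitted multinomial model, averaging estimates only over the models in which a predictor was selected, and all names and toy numbers.

```python
import math
import random

def sbc(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Schwarz's Bayesian Criterion: -2 log L + k ln(n); lower is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

def forward_select(candidates, criterion):
    """Greedy forward selection; `criterion(list_of_predictors)` returns the model's SBC."""
    selected = []
    best = criterion(selected)
    while len(selected) < len(candidates):
        trials = {c: criterion(selected + [c]) for c in candidates if c not in selected}
        name, value = min(trials.items(), key=lambda kv: kv[1])
        if value >= best:  # assumed stopping rule: no further improvement in SBC
            break
        selected.append(name)
        best = value
    return selected

def split_60_40(ids, seed):
    """One random 60:40 derivation/validation split."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.6 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def retain_and_average(fits, min_fraction=0.5):
    """Keep predictors selected in at least `min_fraction` of models; average their estimates."""
    retained = {}
    for name in set().union(*fits):
        estimates = [fit[name] for fit in fits if name in fit]
        if len(estimates) / len(fits) >= min_fraction:
            retained[name] = sum(estimates) / len(estimates)
    return retained

# Toy criterion: each predictor lowers the SBC by its "gain" but costs a penalty of 5
gains = {"age": 50.0, "copd": 20.0, "noise": 1.0}
toy_criterion = lambda s: 100.0 - sum(gains[v] for v in s) + 5.0 * len(s)
chosen = forward_select(list(gains), toy_criterion)

# Toy parameter estimates from four hypothetical derivation models
fits = [{"age": 0.5, "copd": 0.4}, {"age": 0.7},
        {"age": 0.6, "copd": 0.6, "rhinitis": -0.3}, {"age": 0.6}]
averaged = retain_and_average(fits)
derivation, validation = split_60_40(range(10), seed=1)
```

In practice, `criterion` would fit the multinomial logistic model on a derivation split and return its SBC, and the procedure would be repeated over ten seeds before applying `retain_and_average`.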

| Comparing predictive performance with three previous EHR-based risk scores
In the validation data sets, we compared our risk scores' predictive performance with three previous EHR-based risk scores (Elder Risk Assessment [ERA], Multimorbidity Frailty Index [MFI], and the Hospital Frailty Risk Score [HFRS]). 24,27,32 The ERA index was derived using data from the Mayo Clinic EHR to predict critical illness (defined as sepsis, need for mechanical ventilation, or death) within 2 years of the index outpatient visit. The MFI incorporated 32 deficits based on EHR using Taiwan's National Health Record for the prediction of ICU admission. The HFRS was developed using ICD-10 codes that characterized frailty to predict mortality, prolonged hospital stay, and readmission among older adults. All risk scores were calculated in the validation data sets to measure model performance and calculate the average area under the receiver operating characteristic (ROC) curve for ERA, MFI, HFRS, and our two risk scores (the ICU Survivor Risk Score and the Mortality Risk Score).

| RESULTS
We identified an overall sample of 48,127 patients who were aged 50 years and older and had at least one PCP visit in 2017. The study cohort was 60.8% female (29,  4. Among the three groups, ICU survivors had the highest rate of diabetes mellitus type II, while patients who died had the highest rates of chronic obstructive pulmonary disease (COPD), congestive heart failure (CHF), and abnormal chest imaging (see Supporting Information: Table 4). ICD diagnoses for encounters for screening for malignant neoplasms were most frequent in patients who remained alive and were never admitted to the ICU. Characteristics of the predictor variables were similarly distributed between derivation (n = 29,042) and validation (n = 19,085) samples (see Supporting Information: Table 5).

| Predictor variables associated with outcomes
Average parameter estimates and associated risk score weights for each of the retained predictors from the 10 derivation data sets are shown in Table 1. We derived two risk scores: the ICU Survivor Risk Score (score range 0−26, with higher scores indicating greater risk of ICU survivorship) and the Mortality Risk Score (score range 0−41, with higher scores indicating greater risk of death within 2 years).
In our models, greater deciles of age over 50, diagnoses of COPD and CHF, and laboratory abnormalities in serum alkaline phosphatase, hematocrit, and albumin contributed the highest risk score weights for mortality, while ICD-10 diagnoses of encounter for screening for malignant neoplasms and of vasomotor or allergic rhinitis were associated with lower risk of death. In contrast, diagnoses of peripheral vascular disease and COPD were associated with ICU survival (see Table 1). Abnormal values for creatinine, red blood cell count, and sodium, and diagnoses of atrial fibrillation/atrial flutter, were retained predictors in risk scores for both death and ICU survival.

| Performance of predictive risk scores
We measured the predictive performance of our risk scores in ten validation samples (n = 19,085 in each sample). In Table 2, we present the average AUCs (and range) in overall performance and separately for the three outcome groups. We also include mean AUCs for other electronic prediction tools (ERA, MFI, and HFRS) in Table 2.
Our predictive risk scores performed better than ERA, MFI, and HFRS in differentiating between each pair of the three possible outcomes (death vs. alive without ICU admission, ICU survivor vs. alive without ICU admission, and ICU survivor vs. death). In particular, our predictive risk scores had a mean AUC of 0.766 in discriminating ICU survivors from those alive without ICU admission and a mean AUC of 0.854 in discriminating death from alive without ICU admission.
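For readers less familiar with pairwise discrimination, the AUC between two outcome groups equals the probability that a randomly chosen patient from the "positive" group has a higher risk score than a randomly chosen patient from the comparison group, with ties counted as one half. The sketch below computes this directly and summarizes it across validation samples; the function names and toy scores are hypothetical, and the quadratic loop is chosen for clarity rather than speed.

```python
def auc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative (ties count 1/2)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def pairwise_auc(scores, outcomes, positive, negative):
    """AUC restricted to two of the three outcome groups."""
    pos = [s for s, o in zip(scores, outcomes) if o == positive]
    neg = [s for s, o in zip(scores, outcomes) if o == negative]
    return auc(pos, neg)

def summarize(aucs):
    """Mean and range of AUCs across validation samples."""
    return sum(aucs) / len(aucs), min(aucs), max(aucs)

# Toy validation sample with hypothetical risk scores
scores = [2, 5, 1, 3, 9]
outcomes = ["Death", "Death", "Alive", "Alive", "ICU survivor"]
death_vs_alive = pairwise_auc(scores, outcomes, "Death", "Alive")
```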
ROC curves for our predictive risk scores, ERA, MFI and HFRS in one validation sample are presented in Figure 1.
In Table 3, we provide positive predictive values (PPV) for the risk scores using suggested cut-off points to categorize patients into high, medium, and low risk groups for ICU survivorship and death in one validation sample. Patients in the high-risk group for ICU survivorship (ICU Survivor Risk Score ≥18) had the highest proportion of ICU survivorship (PPV = 33.8%, compared to 18.1% in those with scores 11−17 and 4.2% in patients with scores ≤10). Patients in the high-risk group for death (Mortality Risk Score ≥22) also had the highest proportion of death (PPV = 13.9%, compared to 5.7% in those with scores 16−21 and 0.8% in patients with scores ≤15). For comparison, the rates of ICU survivorship and death in the overall cohort were 6.2% and 1.6%, respectively.
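The risk grouping and PPV calculation can be sketched as below, using the cut-off points stated above (ICU Survivor Risk Score: 11 and 18; Mortality Risk Score: 16 and 22). The function names and the toy scores and outcomes are hypothetical and do not reproduce the Table 3 values.

```python
def risk_group(score, medium_min, high_min):
    """Map a risk score to low / medium / high using two cut-off points."""
    if score >= high_min:
        return "high"
    if score >= medium_min:
        return "medium"
    return "low"

def ppv_by_group(scores, is_event, medium_min, high_min):
    """Proportion of patients with the outcome within each risk group."""
    totals, events = {}, {}
    for score, event in zip(scores, is_event):
        group = risk_group(score, medium_min, high_min)
        totals[group] = totals.get(group, 0) + 1
        events[group] = events.get(group, 0) + int(event)
    return {group: events[group] / totals[group] for group in totals}

# Toy ICU Survivor Risk Scores with cut-offs 11 (medium) and 18 (high)
toy_scores = [20, 19, 12, 15, 5, 3, 2, 1]
toy_events = [1, 0, 1, 0, 0, 0, 0, 1]
toy_ppv = ppv_by_group(toy_scores, toy_events, medium_min=11, high_min=18)
```

For the Mortality Risk Score the same function would be called with `medium_min=16` and `high_min=22`.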
TABLE 1 Results of multinomial logistic model comparing death and ICU survival to no events in 10 derivation samples.

| DISCUSSION
We developed two EHR-based predictive risk scores (ICU Survivor Risk Score and Mortality Risk Score) to identify community dwelling older adults at risk for ICU admission/survivorship or death. In contrast with other recent studies, we categorized ICU survivorship as an independent outcome group with the objective of identifying older adults in the population at risk for ICU admission. Our models are intended to assist hospitals, health systems, and researchers in identifying cohorts of patients at higher risk for critical illness survivorship (i.e., patients at risk for PICS) and mortality, rather than for bedside clinical prognostication at the individual patient level. We found that our model's risk scores outperformed previously validated EHR-based risk predictions for ICU survival (AUC 0.765) and death (AUC 0.858) within 2 years.
In recent years, a growing number of models that predict patient-level outcomes have been published. 34-36 The goals of our predictive models were instead to utilize EHR data in the pre-ICU phase to identify groups of patients at increased risk for future ICU admission and survival within 2 years. In this paper, we present the development of two practical and scalable risk scores that focus on earlier detection of patients at risk for poor outcomes, that is, in community dwelling adults in the "pre-ICU phase." In the process of developing the risk scores, we also identified the variables associated with increased risk of mortality.
These predictors included older age, diagnoses of COPD and CHF, and laboratory abnormalities in multiple organ systems (such as liver, kidney, and blood). 13,14 By their design, however, prior studies relied on in-hospital or in-ICU clinical data, whereas our study includes variables that may be predictive of mortality even when measured in the outpatient, pre-ICU phases of care.
We also identified variables in our risk score which were associated with ICU survival. Diagnoses of COPD, CHF, and peripheral vascular disease (PVD) were all associated with ICU admission/survival. In addition, diagnoses of vasomotor/allergic rhinitis were associated with a lower risk of mortality.

Our study has several strengths. First, we utilized a comprehensive medical record database with diverse socioeconomic and racial representation. Second, our study's large sample size allowed the use of cross-validation for the development and testing of the predictive models, so that both derivation and validation samples were sufficiently large to provide stable estimates. Third, model selection was based on collective model performance criteria instead of the significance of single variables. Finally, we compared our predictive models to several current EHR-based risk scores to demonstrate the superior performance of our model prediction.

Our study also has limitations inherent to EHR data. First, the EHR data did not include some variables known to be associated with mortality, such as socioeconomic status (including level of education), nutrition status, or physical activity levels. Second, our model was developed to predict outcomes among older adults, and therefore its validity in younger adults at risk for PICS is not yet known. Third, as our inclusion criteria required an index PCP visit, how well the model identifies patients who were never seen in primary care before critical illness is not known. Finally, our model's limited performance in differentiating between patients who will survive the ICU and those who die suggests that additional data input is needed to improve predictive accuracy between these two outcome groups.

In summary, we developed and validated risk scores to predict outcomes in older adults. Our models perform well in identifying patients at risk for death and those who will survive after ICU admission. The future scope of our work includes confirmation of the Predictive Risk Scores in large, prospective cohort studies. Implementation of the risk scores within population health settings may enable early identification of older adults at risk for critical illness, allowing researchers and health systems to design and test novel interventions to optimize post-ICU recovery.
Medicaid insurance (7299/48,127). During the 2-year follow-up period, 92.2% of patients were alive and never admitted to the ICU (44,370/48,127), 6.2% were admitted to the ICU at least once and survived (2994/48,127), and 1.6% died (763/48,127). Rates of demographics, as well as variables selected for the risk scores (ICD-10 diagnoses, laboratory values, clinical testing, and health care utilization before the index primary care visit by the outcome groups) are shown in Supporting Information: Table

FIGURE 1 Receiver operating characteristic (ROC) curves of Predictive Risk Scores compared to Elder Risk Assessment (ERA), Multimorbidity Frailty Index (MFI), and Hospital Frailty Risk Score (HFRS) in one validation sample (n = 19,085). ERA shown in blue, MFI in red, HFRS in green, and Predictive Risk Score in brown.
TABLE 3 Numbers (proportions) of patients in each outcome group using cut-off predictive risk scores, and positive predictive values, in one validation sample (n = 19,085).